GENOME ANALYSIS REPORT

Data used in this analysis

In this experiment, the start set data are:

GENERAL DESCRIPTION

This report shows all the genome analysis information including:

The problem strains were assigned to the genus Shewanella based on SILVA taxonomy results (https://www.arb-silva.de/documentation/silva-taxonomy/). Species were identified by blastn of the 16S rRNA gene and by comparing the problem strain genomes with 120 complete genomes available in the NCBI database. The complete genome of Shewanella sp. Pdp11 and the partial genomes of the pathogenic and saprophytic Shewanella strains were annotated with the DFAST tool using Clusters of Orthologous Groups (COG), and the number of genes associated with each category was determined. Transposons were detected with ISEScan (https://github.com/xiezhq/ISEScan), complemented by the tp_finder workflow for identifying transposases and disrupted proteins. Genomic islands (GIs) were identified using IslandViewer 4, and GI enrichment was analysed with DEgenes Hunter (https://github.com/seoanezonjic/ExpHunterSuite), using the API option. Comparing the genomes of Shewanella sp. Pdp11 and the pathogenic strains allowed us to identify genes that are specific to the pathogenic strains and absent from the probiotic strain.

16S rRNA

The table shows the results of the 16S rRNA analysis of the problem genomes, carried out with blastn. For each strain, the top hits (up to 10) with high identity and coverage values are listed.

Strain  Reference    Identity (%)  Coverage (%)
Pdp11   NR_178284.1  99.580        100
Pdp11   NR_113967.1  99.055        100
Pdp11   NR_041296.1  99.160        100
Pdp11   NR_108852.1  99.160        100
Pdp11   NR_044863.1  98.950        100
Pdp11   NR_119141.1  98.845        100
Pdp11   NR_025267.1  98.847        100
SdM1    NR_108852.1  99.483        100
SdM1    NR_178284.1  99.353        100
SdM1    NR_044863.1  99.224        100
SdM1    NR_119141.1  98.965        100
SdM1    NR_113967.1  98.706        100
SdM1    NR_041296.1  98.836        100
SdM1    NR_116732.1  98.708        100
SdM2    NR_113582.1  99.107        100
SdM2    NR_119141.1  99.106        100
SdM2    NR_044863.1  99.018        100
SdM2    NR_104770.1  98.839        100
SH12    NR_116732.1  99.326        100
SH12    NR_074798.1  99.038        100
SH12    NR_036917.1  98.845        100
SH12    NR_044863.1  98.847        100
SH16    NR_044863.1  99.174        100
SH16    NR_178284.1  99.174        100
SH16    NR_119141.1  99.071        100
SH16    NR_113582.1  98.762        100
SH16    NR_116732.1  98.763        100
SH4     NR_116732.1  99.217        100
SH4     NR_074798.1  98.957        100
SH4     NR_036917.1  98.783        100
SH6     NR_044863.1  99.285        100
SH6     NR_119141.1  99.081        100
SH6     NR_178284.1  99.081        100
SH6     NR_113582.1  98.774        100
SH6     NR_116732.1  98.878        100
SH9     NR_044863.1  99.106        100
SH9     NR_119141.1  98.833        100
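
As an illustration of how a table like this can be summarised, the following minimal sketch picks the best-scoring reference for each strain. It is not part of the original workflow: the function name best_hit_per_strain and the ranking rule (identity first, then coverage) are assumptions, the accessions are written in standard RefSeq form (NR_ prefix), and only a few rows of the table are copied in.

```python
# A few rows copied from the 16S table:
# (strain, reference accession, identity %, coverage %).
HITS = [
    ("Pdp11", "NR_178284.1", 99.580, 100),
    ("Pdp11", "NR_113967.1", 99.055, 100),
    ("Pdp11", "NR_041296.1", 99.160, 100),
    ("SdM1", "NR_108852.1", 99.483, 100),
    ("SdM1", "NR_178284.1", 99.353, 100),
    ("SH4", "NR_116732.1", 99.217, 100),
    ("SH4", "NR_074798.1", 98.957, 100),
]


def best_hit_per_strain(hits):
    """Return {strain: (accession, identity, coverage)} for the best hit.

    Assumed ranking rule: highest identity wins, coverage breaks ties.
    """
    best = {}
    for strain, accession, identity, coverage in hits:
        current = best.get(strain)
        if current is None or (identity, coverage) > (current[1], current[2]):
            best[strain] = (accession, identity, coverage)
    return best


if __name__ == "__main__":
    for strain, (accession, identity, coverage) in best_hit_per_strain(HITS).items():
        print(strain, accession, identity, coverage)
```

A best 16S hit alone does not settle the species assignment here, since several references exceed 99 % identity for the same strain; this is why the report also relies on the whole-genome comparison.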

Complete genome comparison

The heatmaps show the Average Nucleotide Identity (ANI), calculated using MUMmer (NUCmer) to align the input sequences. The results are the nucleotide identity (first heatmap) and the coverage (second heatmap), both on a scale of 0 to 1, between our 8 study strains and 129 complete genomes of the genus Shewanella available at NCBI. The probiotic Shewanella Pdp11 showed high identity and coverage with Shewanella baltica, SH4 and SH12 with Shewanella xiamenensis, while SH6, SH16, SH9 and SdM1 were similar to Shewanella oncorhynchi.
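
To make the two heatmap quantities concrete, the sketch below shows one common way to derive an identity and a coverage value between 0 and 1 from a set of alignment blocks. It is a simplified illustration, not the exact procedure used for this report: the function name ani_and_coverage, the block representation (aligned length, number of mismatching positions) and the assumption that blocks do not overlap are all hypothetical.

```python
def ani_and_coverage(blocks, query_length):
    """Compute (identity, coverage), both in the range 0 to 1.

    blocks       -- list of (aligned_length, errors) tuples, one per
                    alignment block; blocks are assumed not to overlap
    query_length -- total length of the query genome in bp
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    aligned = sum(length for length, _ in blocks)
    if aligned == 0:
        return 0.0, 0.0
    errors = sum(err for _, err in blocks)
    identity = (aligned - errors) / aligned  # length-weighted identity
    coverage = aligned / query_length        # fraction of the query aligned
    return identity, coverage


if __name__ == "__main__":
    # Hypothetical numbers: two blocks covering 4,000 bp of a 5,000 bp query.
    print(ani_and_coverage([(1000, 10), (3000, 30)], 5000))
```

Reporting coverage next to identity matters because a high identity computed over a small aligned fraction says little about overall genome similarity.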

Sibelia results

Synteny blocks of Shewanella Pdp11 with S. baltica 128 (left) and S. putrefaciens 4H (right).

Genome annotation

The DFAST tool (https://dfast.ddbj.nig.ac.jp/) was used for COG annotation of the 137 complete genomes. The heatmaps show the number of genes associated with each COG functional category (left heatmap) and the same number normalized by each genome's number of coding sequences (CDS), expressed as a percentage of genes (gene %) (right heatmap).
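
The normalization behind the right heatmap can be sketched as follows. This is a minimal illustration under stated assumptions: the function name cog_gene_percent and the example counts are hypothetical, and the per-category counts are assumed to have been extracted from the annotation output beforehand.

```python
def cog_gene_percent(cog_counts, n_cds):
    """Convert raw gene counts per COG category into gene % of all CDS.

    cog_counts -- {COG category letter: number of genes in that category}
    n_cds      -- total number of coding sequences in the genome
    """
    if n_cds <= 0:
        raise ValueError("n_cds must be positive")
    return {category: 100.0 * count / n_cds
            for category, count in cog_counts.items()}


if __name__ == "__main__":
    # Hypothetical genome with 4,000 CDS.
    counts = {"C": 250, "E": 300, "L": 150}
    print(cog_gene_percent(counts, 4000))
```

Dividing by the CDS count makes genomes of different sizes comparable, which matters when complete and partial genomes appear in the same heatmap.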

Transposon

Transposable elements (TEs) were identified in the 137 complete genomes with a custom workflow using ISEScan (https://github.com/xiezhq/ISEScan) and tp_finder. The results show the number of TEs identified in each genome (left barplot), the number of TEs normalized by the number of coding sequences (CDS) (right barplot), and the TEs repeated within the genome (Table). The presence or absence of disrupted proteins (left heatmap) and transposases (right heatmap) in each genome was also determined and is represented as a matrix of 0s and 1s.

Genome coordinate                 Start (tp_finder)  End (tp_finder)  Transposase protein  Disrupted protein
Shewanella Pdp11:850951:856278    2033               3326             A0A883CDA5           A0A4Y5YC98
Shewanella Pdp11:1261471:1266798  1992               3309             A0A883CDA5           A9KZG6
Shewanella Pdp11:1906106:1911827  2000               3723             A0AAE4PW88           A0AAJ2MQ60
Shewanella Pdp11:2693976:2700579  1969               4609             A0AAE4PZ99           A0A2G1ZI34
Shewanella Pdp11:2990403:2995648  1999               3241             Q8EF58               A3D7T2
Shewanella Pdp11:3053373:3058616  2049               3244             A0A9E6JBE3           A0AAP4AW47
Shewanella Pdp11:3073685:3079406  1999               3725             A0AAE4PW88           A0AAJ2J2I4
Shewanella Pdp11:4315854:4321181  2006               3320             A0A883CDA5           A0AAE4TJG1
Shewanella Pdp11:4349020:4355949  1959               3354             A0AAP2ZPU8           A0A6G9QQZ7
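
The Pdp11 table can be used to show how repeated TEs and a 0/1 presence matrix might be derived. The sketch below is not the tp_finder workflow itself: the function names repeated_transposases and presence_absence are hypothetical, and treating two insertions as repeats when they share the same transposase accession is an assumption made for illustration.

```python
from collections import Counter

# (coordinate within Pdp11, transposase accession), copied from the table.
ROWS = [
    ("850951:856278", "A0A883CDA5"),
    ("1261471:1266798", "A0A883CDA5"),
    ("1906106:1911827", "A0AAE4PW88"),
    ("2693976:2700579", "A0AAE4PZ99"),
    ("2990403:2995648", "Q8EF58"),
    ("3053373:3058616", "A0A9E6JBE3"),
    ("3073685:3079406", "A0AAE4PW88"),
    ("4315854:4321181", "A0A883CDA5"),
    ("4349020:4355949", "A0AAP2ZPU8"),
]


def repeated_transposases(rows):
    """Return {transposase accession: copies} for accessions seen more than once."""
    counts = Counter(transposase for _, transposase in rows)
    return {acc: n for acc, n in counts.items() if n > 1}


def presence_absence(genome_to_proteins, all_proteins):
    """Build a 0/1 row per genome, one column per protein in all_proteins."""
    return {genome: [1 if protein in proteins else 0 for protein in all_proteins]
            for genome, proteins in genome_to_proteins.items()}


if __name__ == "__main__":
    print(repeated_transposases(ROWS))
    found = {"Pdp11": {transposase for _, transposase in ROWS}}
    print(presence_absence(found, ["A0A883CDA5", "Q8EF58", "A0A000XXX0"]))
```

In this table, A0A883CDA5 occurs at three loci and A0AAE4PW88 at two, while the remaining transposases occur once each.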

Genomic Island

The mobilome analysis included the identification of genomic islands (GIs) using IslandViewer 4 (https://www.pathogenomics.sfu.ca/islandviewer/). The results show the total number of GIs identified in the 137 Shewanella sp. strains studied. The figure shows the number of genomic islands in which some functional categories, as determined by DFAST, are significant. Additionally, functional enrichment of COG categories was performed with the ExpHunterSuite script clusters_to_enrichment.R, which is based on the clusterProfiler package, with an FDR threshold of 0.05 (-p 0.05). The heatmaps show the number of GIs included in each COG category (left) and this number as a proportion of the total number of GIs (GI %) (right).
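
For readers unfamiliar with functional enrichment, the sketch below illustrates the general over-representation approach: a hypergeometric test per category followed by Benjamini-Hochberg FDR correction. It is not the clusters_to_enrichment.R script, and it is only assumed to be comparable to what that script computes. All function names and example counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes, K in category, n drawn)."""
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_categories(gi_counts, genome_counts, n_gi_genes, n_genome_genes, fdr=0.05):
    """Return {category: adjusted p-value} for categories passing the FDR cut.

    gi_counts      -- {category: genes of that category inside GIs}
    genome_counts  -- {category: genes of that category in the whole genome}
    """
    categories = sorted(gi_counts)
    pvalues = [hypergeom_sf(gi_counts[c], n_genome_genes, genome_counts[c], n_gi_genes)
               for c in categories]
    adjusted = benjamini_hochberg(pvalues)
    return {c: q for c, q in zip(categories, adjusted) if q <= fdr}


if __name__ == "__main__":
    # Hypothetical genome: 1,000 annotated genes, 100 of them inside GIs.
    gi = {"X": 20, "L": 10}
    genome = {"X": 50, "L": 100}
    print(enriched_categories(gi, genome, 100, 1000))
```

In the hypothetical example, category X is over-represented (20 observed against about 5 expected), whereas category L matches its expected count and is filtered out.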

Prophage

Prophages were analysed using the PHASTEST web server and were identified in 74 complete genomes of Shewanella.

Probiotic and pathogenic strains comparison

This analysis focused on comparing the pathogenic and probiotic strains using Tarsynflow, which identified genes that are specific to the pathogenic strains and absent from the probiotic strain.

Entry protein names organism
A0A073KRP2 L-lactate permease Shewanella xiamenensis
A0A380B106 L-seryl-tRNA(Sec) selenium transferase (EC 2.9.1.1) (Selenocysteine synthase) (Sec synthase) (Selenocysteinyl-tRNA(Sec) synthase) Shewanella morhuae
A0A1E3UTG5 DUF4274 domain-containing protein Shewanella xiamenensis
A0A1E3UVI4 LuxR C-terminal-related transcriptional regulator (LuxR family transcriptional regulator) Shewanella xiamenensis
A0A220US46 Two-component system response regulator Shewanella bicestrii
A0A2M7HTT6 Sodium:alanine symporter Shewanella sp. CG12_big_fil_rev_8_21_14_0_65_47_15
A0A4R2FB63 Succinate dehydrogenase/fumarate reductase flavoprotein subunit Shewanella fodinae
A0A4R2FJ85 Two-component system response regulator AdeR Shewanella fodinae
A0A501XHX2 CDP-glycerol--glycerophosphate glycerophosphotransferase Shewanella sp. LC6
A0A501XY11 Cytochrome c3 family protein Shewanella sp. LC6
A0A5B8QSE5 TetR/AcrR family transcriptional regulator Shewanella decolorationis
A0A6N3J9Z1 histidine kinase (EC 2.7.13.3) Shewanella sp. (strain W3-18-1)
A0A9X2WTL9 histidine kinase (EC 2.7.13.3) Shewanella septentrionalis
A0AAJ2J6D6 Atu1372/SO_1960 family protein Shewanella sp. SP1S1-7
A3D7Z8 Sodium/calcium exchanger membrane region Shewanella baltica (strain OS155 / ATCC BAA-1091)
A9KV88 Selenocysteine-specific translation elongation factor Shewanella baltica (strain OS195)
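
Conceptually, the comparison in this section is a set difference between the gene content of the pathogenic strains and that of the probiotic strain. The sketch below illustrates that logic only. It does not reproduce Tarsynflow, the function name pathogen_specific and the require_all switch are hypothetical, and the gene identifiers in the example are placeholders rather than results.

```python
def pathogen_specific(pathogenic, probiotic, require_all=False):
    """Return genes found in pathogenic strains but not in the probiotic one.

    pathogenic  -- {strain name: iterable of gene/protein identifiers}
    probiotic   -- iterable of identifiers found in the probiotic strain
    require_all -- if True, keep only genes shared by every pathogenic
                   strain; if False, genes present in at least one
    """
    gene_sets = [set(genes) for genes in pathogenic.values()]
    if not gene_sets:
        return set()
    pooled = set.intersection(*gene_sets) if require_all else set.union(*gene_sets)
    return pooled - set(probiotic)


if __name__ == "__main__":
    # Placeholder identifiers, not real annotations.
    pathogens = {"strain_1": {"geneA", "geneB", "geneC"},
                 "strain_2": {"geneB", "geneC", "geneD"}}
    probiotic_genes = {"geneC"}
    print(sorted(pathogen_specific(pathogens, probiotic_genes)))
    print(sorted(pathogen_specific(pathogens, probiotic_genes, require_all=True)))
```

Whether a gene must be present in every pathogenic strain or in only one changes the resulting list considerably, so the criterion should be stated alongside any table of strain-specific genes.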